Overview

Dataset statistics

Number of variables: 22
Number of observations: 721344
Missing cells: 99519
Missing cells (%): 0.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 121.1 MiB
Average record size in memory: 176.0 B
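
The derived figures in this block can be reproduced from the raw counts. The sketch below is only an arithmetic check, assuming that "missing cells (%)" is measured against variables × observations and that 1 MiB = 1024² bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 22
n_observations = 721_344
missing_cells = 99_519
total_mib = 121.1

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024**2 bytes.
avg_record_bytes = total_mib * 1024**2 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```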

Variable types

Numeric: 14
DateTime: 1
Categorical: 7

Warnings

origin has a high cardinality: 357 distinct values (High cardinality)
dest has a high cardinality: 357 distinct values (High cardinality)
crs_dep_time is highly correlated with wheels_off and 2 other fields (High correlation)
dep_delay is highly correlated with arr_delay (High correlation)
wheels_off is highly correlated with crs_dep_time and 2 other fields (High correlation)
wheels_on is highly correlated with crs_dep_time and 2 other fields (High correlation)
crs_arr_time is highly correlated with crs_dep_time and 2 other fields (High correlation)
arr_delay is highly correlated with dep_delay and 1 other field (High correlation)
crs_elapsed_time is highly correlated with actual_elapsed_time and 2 other fields (High correlation)
actual_elapsed_time is highly correlated with crs_elapsed_time and 2 other fields (High correlation)
air_time is highly correlated with crs_elapsed_time and 2 other fields (High correlation)
distance is highly correlated with crs_elapsed_time and 2 other fields (High correlation)
delayed is highly correlated with arr_delay (High correlation)
dep_delay is highly correlated with arr_delay and 1 other field (High correlation)
delayed is highly correlated with dep_delay and 1 other field (High correlation)
dep_delay is highly correlated with arr_delay (High correlation)
distance is highly correlated with air_time and 2 other fields (High correlation)
wheels_on is highly correlated with crs_arr_time and 2 other fields (High correlation)
crs_arr_time is highly correlated with wheels_on and 2 other fields (High correlation)
df_index is highly correlated with week (High correlation)
arr_delay is highly correlated with dep_delay (High correlation)
air_time is highly correlated with distance and 2 other fields (High correlation)
week is highly correlated with df_index (High correlation)
wheels_off is highly correlated with wheels_on and 2 other fields (High correlation)
actual_elapsed_time is highly correlated with distance and 2 other fields (High correlation)
crs_dep_time is highly correlated with wheels_on and 2 other fields (High correlation)
crs_elapsed_time is highly correlated with distance and 2 other fields (High correlation)
dep_delay has 11772 (1.6%) missing values (Missing)
taxi_out has 11624 (1.6%) missing values (Missing)
wheels_off has 11624 (1.6%) missing values (Missing)
wheels_on has 11950 (1.7%) missing values (Missing)
taxi_in has 11950 (1.7%) missing values (Missing)
arr_delay has 13704 (1.9%) missing values (Missing)
actual_elapsed_time has 13447 (1.9%) missing values (Missing)
air_time has 13447 (1.9%) missing values (Missing)
df_index has unique values (Unique)
dep_delay has 35060 (4.9%) zeros (Zeros)
arr_delay has 13971 (1.9%) zeros (Zeros)

Reproduction

Analysis started: 2021-08-22 23:38:55.967045
Analysis finished: 2021-08-22 23:42:52.878756
Duration: 3 minutes and 56.91 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
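
The reported duration follows directly from the two timestamps. A small sketch using only the Python standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2021-08-22 23:38:55.967045")
finished = datetime.fromisoformat("2021-08-22 23:42:52.878756")

# Split the elapsed time into whole minutes and remaining seconds.
elapsed_seconds = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed_seconds, 60)
minutes = int(minutes)

print(f"{minutes} minutes and {seconds:.2f} seconds")
```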

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 721344
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3604130.289
Minimum: 18
Maximum: 7213425
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 18
5-th percentile: 360687.3
Q1: 1802147.5
median: 3601846
Q3: 5406346.25
95-th percentile: 6848931.7
Maximum: 7213425
Range: 7213407
Interquartile range (IQR): 3604198.75

Descriptive statistics

Standard deviation: 2081222.031
Coefficient of variation (CV): 0.5774547155
Kurtosis: -1.200276611
Mean: 3604130.289
Median Absolute Deviation (MAD): 1802139
Skewness: 0.001414650665
Sum: 2.599817759 × 10^12
Variance: 4.331485142 × 10^12
Monotonicity: Not monotonic
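
Several of the df_index statistics are derived from others in the same block. The sketch below recomputes them from the reported inputs as a consistency check; it assumes the usual definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared) and is not the report's own code.

```python
# Inputs copied from the df_index statistics.
minimum, maximum = 18, 7213425
q1, q3 = 1802147.5, 5406346.25
mean = 3604130.289
std = 2081222.031

value_range = maximum - minimum   # reported as 7213407
iqr = q3 - q1                     # reported as 3604198.75
cv = std / mean                   # reported as 0.5774547155
variance = std ** 2               # reported as 4.331485142 × 10^12

print(value_range, iqr, round(cv, 6), f"{variance:.6e}")
```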
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
3149822                1       < 0.1%
5826486                1       < 0.1%
7161090                1       < 0.1%
354319                 1       < 0.1%
2272511                1       < 0.1%
626851                 1       < 0.1%
4361467                1       < 0.1%
4359418                1       < 0.1%
3316985                1       < 0.1%
2625989                1       < 0.1%
Other values (721334)  721334  > 99.9%

Value    Count  Frequency (%)
18       1      < 0.1%
32       1      < 0.1%
37       1      < 0.1%
44       1      < 0.1%
51       1      < 0.1%
68       1      < 0.1%
94       1      < 0.1%
114      1      < 0.1%
127      1      < 0.1%
142      1      < 0.1%

Value    Count  Frequency (%)
7213425  1      < 0.1%
7213377  1      < 0.1%
7213358  1      < 0.1%
7213337  1      < 0.1%
7213323  1      < 0.1%
7213316  1      < 0.1%
7213308  1      < 0.1%
7213305  1      < 0.1%
7213299  1      < 0.1%
7213296  1      < 0.1%

Distinct: 365
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 MiB
Minimum: 2018-01-01 00:00:00
Maximum: 2018-12-31 00:00:00
Histogram with fixed size bins (bins=50)

op_carrier
Categorical

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 MiB

WN: 135087
DL: 94694
AA: 92050
OO: 77427
UA: 62102
Other values (13): 259984

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1442688
Distinct characters: 22
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
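
As a rough illustration of the character properties mentioned above, the sketch below tallies Unicode general categories for a handful of carrier codes using Python's standard unicodedata module. The codes are the five sample rows of op_carrier plus B6 from its common values; this is not the report's own implementation, and the standard library exposes general categories ("Lu" = Uppercase Letter, "Nd" = Decimal Number) but not scripts or blocks.

```python
import unicodedata
from collections import Counter

# Illustrative subset of op_carrier values taken from this report.
codes = ["AS", "EV", "OO", "HA", "WN", "B6"]

# Count the general category of every character across all codes.
category_counts = Counter(
    unicodedata.category(ch) for code in codes for ch in code
)

for category, count in category_counts.most_common():
    print(category, count)
```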

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AS
2nd row: EV
3rd row: OO
4th row: HA
5th row: WN

Common Values

Value             Count   Frequency (%)
WN                135087  18.7%
DL                94694   13.1%
AA                92050   12.8%
OO                77427   10.7%
UA                62102   8.6%
YX                31511   4.4%
B6                30512   4.2%
MQ                29577   4.1%
OH                27888   3.9%
AS                24577   3.4%
Other values (8)  115919  16.1%

Length

Histogram of lengths of the category
Value             Count   Frequency (%)
wn                135087  18.7%
dl                94694   13.1%
aa                92050   12.8%
oo                77427   10.7%
ua                62102   8.6%
yx                31511   4.4%
b6                30512   4.2%
mq                29577   4.1%
oh                27888   3.9%
as                24577   3.4%
Other values (8)  115919  16.1%

Most occurring characters

Value              Count   Frequency (%)
A                  279226  19.4%
O                  182742  12.7%
N                  152866  10.6%
W                  135087  9.4%
D                  94694   6.6%
L                  94694   6.6%
U                  62102   4.3%
Y                  52948   3.7%
E                  44941   3.1%
V                  43778   3.0%
Other values (12)  299610  20.8%

Most occurring categories

Value             Count    Frequency (%)
Uppercase Letter  1366261  94.7%
Decimal Number    76427    5.3%

Most frequent character per category

Uppercase Letter
Value             Count   Frequency (%)
A                 279226  20.4%
O                 182742  13.4%
N                 152866  11.2%
W                 135087  9.9%
D                 94694   6.9%
L                 94694   6.9%
U                 62102   4.5%
Y                 52948   3.9%
E                 44941   3.3%
V                 43778   3.2%
Other values (9)  223183  16.3%

Decimal Number
Value  Count  Frequency (%)
9      36412  47.6%
6      30512  39.9%
4      9503   12.4%

Most occurring scripts

Value   Count    Frequency (%)
Latin   1366261  94.7%
Common  76427    5.3%

Most frequent character per script

Latin
Value             Count   Frequency (%)
A                 279226  20.4%
O                 182742  13.4%
N                 152866  11.2%
W                 135087  9.9%
D                 94694   6.9%
L                 94694   6.9%
U                 62102   4.5%
Y                 52948   3.9%
E                 44941   3.3%
V                 43778   3.2%
Other values (9)  223183  16.3%

Common
Value  Count  Frequency (%)
9      36412  47.6%
6      30512  39.9%
4      9503   12.4%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1442688  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
A                  279226  19.4%
O                  182742  12.7%
N                  152866  10.6%
W                  135087  9.4%
D                  94694   6.6%
L                  94694   6.6%
U                  62102   4.3%
Y                  52948   3.7%
E                  44941   3.1%
V                  43778   3.0%
Other values (12)  299610  20.8%

origin
Categorical

HIGH CARDINALITY

Distinct: 357
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 MiB

ATL: 38995
ORD: 33262
DFW: 27929
DEN: 23622
CLT: 23266
Other values (352): 574270

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2164032
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SFO
2nd row: ORD
3rd row: DEN
4th row: LIH
5th row: MCO

Common Values

Value               Count   Frequency (%)
ATL                 38995   5.4%
ORD                 33262   4.6%
DFW                 27929   3.9%
DEN                 23622   3.3%
CLT                 23266   3.2%
LAX                 22215   3.1%
SFO                 17722   2.5%
IAH                 17603   2.4%
PHX                 17429   2.4%
LGA                 17049   2.4%
Other values (347)  482252  66.9%

Length

Histogram of lengths of the category
Value               Count   Frequency (%)
atl                 38995   5.4%
ord                 33262   4.6%
dfw                 27929   3.9%
den                 23622   3.3%
clt                 23266   3.2%
lax                 22215   3.1%
sfo                 17722   2.5%
iah                 17603   2.4%
phx                 17429   2.4%
lga                 17049   2.4%
Other values (347)  482252  66.9%

Most occurring characters

Value              Count   Frequency (%)
A                  241438  11.2%
L                  205975  9.5%
S                  176144  8.1%
D                  169572  7.8%
T                  123311  5.7%
O                  116296  5.4%
C                  109935  5.1%
M                  96136   4.4%
F                  90890   4.2%
W                  85445   3.9%
Other values (16)  748890  34.6%

Most occurring categories

Value             Count    Frequency (%)
Uppercase Letter  2164032  100.0%

Most frequent character per category

Uppercase Letter
Value              Count   Frequency (%)
A                  241438  11.2%
L                  205975  9.5%
S                  176144  8.1%
D                  169572  7.8%
T                  123311  5.7%
O                  116296  5.4%
C                  109935  5.1%
M                  96136   4.4%
F                  90890   4.2%
W                  85445   3.9%
Other values (16)  748890  34.6%

Most occurring scripts

Value  Count    Frequency (%)
Latin  2164032  100.0%

Most frequent character per script

Latin
Value              Count   Frequency (%)
A                  241438  11.2%
L                  205975  9.5%
S                  176144  8.1%
D                  169572  7.8%
T                  123311  5.7%
O                  116296  5.4%
C                  109935  5.1%
M                  96136   4.4%
F                  90890   4.2%
W                  85445   3.9%
Other values (16)  748890  34.6%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2164032  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
A                  241438  11.2%
L                  205975  9.5%
S                  176144  8.1%
D                  169572  7.8%
T                  123311  5.7%
O                  116296  5.4%
C                  109935  5.1%
M                  96136   4.4%
F                  90890   4.2%
W                  85445   3.9%
Other values (16)  748890  34.6%

dest
Categorical

HIGH CARDINALITY

Distinct: 357
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 MiB

ATL: 38998
ORD: 33277
DFW: 27926
DEN: 23519
CLT: 23211
Other values (352): 574413

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2164032
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: LAX
2nd row: SGF
3rd row: SUN
4th row: HNL
5th row: ALB

Common Values

Value               Count   Frequency (%)
ATL                 38998   5.4%
ORD                 33277   4.6%
DFW                 27926   3.9%
DEN                 23519   3.3%
CLT                 23211   3.2%
LAX                 21953   3.0%
PHX                 17562   2.4%
IAH                 17554   2.4%
SFO                 17495   2.4%
LGA                 17067   2.4%
Other values (347)  482782  66.9%

Length

Histogram of lengths of the category
Value               Count   Frequency (%)
atl                 38998   5.4%
ord                 33277   4.6%
dfw                 27926   3.9%
den                 23519   3.3%
clt                 23211   3.2%
lax                 21953   3.0%
phx                 17562   2.4%
iah                 17554   2.4%
sfo                 17495   2.4%
lga                 17067   2.4%
Other values (347)  482782  66.9%

Most occurring characters

Value              Count   Frequency (%)
A                  241375  11.2%
L                  205560  9.5%
S                  175611  8.1%
D                  169840  7.8%
T                  123427  5.7%
O                  116520  5.4%
C                  109799  5.1%
M                  96067   4.4%
F                  90573   4.2%
W                  85276   3.9%
Other values (16)  749984  34.7%

Most occurring categories

Value             Count    Frequency (%)
Uppercase Letter  2164032  100.0%

Most frequent character per category

Uppercase Letter
Value              Count   Frequency (%)
A                  241375  11.2%
L                  205560  9.5%
S                  175611  8.1%
D                  169840  7.8%
T                  123427  5.7%
O                  116520  5.4%
C                  109799  5.1%
M                  96067   4.4%
F                  90573   4.2%
W                  85276   3.9%
Other values (16)  749984  34.7%

Most occurring scripts

Value  Count    Frequency (%)
Latin  2164032  100.0%

Most frequent character per script

Latin
Value              Count   Frequency (%)
A                  241375  11.2%
L                  205560  9.5%
S                  175611  8.1%
D                  169840  7.8%
T                  123427  5.7%
O                  116520  5.4%
C                  109799  5.1%
M                  96067   4.4%
F                  90573   4.2%
W                  85276   3.9%
Other values (16)  749984  34.7%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2164032  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
A                  241375  11.2%
L                  205560  9.5%
S                  175611  8.1%
D                  169840  7.8%
T                  123427  5.7%
O                  116520  5.4%
C                  109799  5.1%
M                  96067   4.4%
F                  90573   4.2%
W                  85276   3.9%
Other values (16)  749984  34.7%

crs_dep_time
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1331
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1330.840573
Minimum: 1
Maximum: 2359
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 605
Q1: 915
median: 1323
Q3: 1735
95-th percentile: 2130
Maximum: 2359
Range: 2358
Interquartile range (IQR): 820

Descriptive statistics

Standard deviation: 490.7052696
Coefficient of variation (CV): 0.368718297
Kurtosis: -1.03596056
Mean: 1330.840573
Median Absolute Deviation (MAD): 412
Skewness: 0.06418971409
Sum: 959993862
Variance: 240791.6616
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
600                  14320   2.0%
700                  9834    1.4%
800                  5815    0.8%
830                  4496    0.6%
630                  4193    0.6%
900                  4158    0.6%
1000                 4069    0.6%
730                  4049    0.6%
1700                 3843    0.5%
1200                 3792    0.5%
Other values (1321)  662775  91.9%

Value  Count  Frequency (%)
1      12     < 0.1%
2      1      < 0.1%
3      8      < 0.1%
4      20     < 0.1%
5      72     < 0.1%
6      5      < 0.1%
7      8      < 0.1%
8      2      < 0.1%
9      9      < 0.1%
10     47     < 0.1%

Value  Count  Frequency (%)
2359   631    0.1%
2358   53     < 0.1%
2357   50     < 0.1%
2356   46     < 0.1%
2355   299    < 0.1%
2354   35     < 0.1%
2353   32     < 0.1%
2352   15     < 0.1%
2351   22     < 0.1%
2350   174    < 0.1%

dep_delay
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 994
Distinct (%): 0.1%
Missing: 11772
Missing (%): 1.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.982390793
Minimum: -57
Maximum: 1861
Zeros: 35060
Zeros (%): 4.9%
Negative: 429744
Negative (%): 59.6%
Memory size: 5.5 MiB

Quantile statistics

Minimum: -57
5-th percentile: -10
Q1: -5
median: -2
Q3: 7
95-th percentile: 73
Maximum: 1861
Range: 1918
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 44.81726599
Coefficient of variation (CV): 4.489632486
Kurtosis: 165.7336724
Mean: 9.982390793
Median Absolute Deviation (MAD): 4
Skewness: 9.509406521
Sum: 7083225
Variance: 2008.587331
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
-5                  55742   7.7%
-4                  54559   7.6%
-3                  53272   7.4%
-2                  47828   6.6%
-6                  44858   6.2%
-1                  41606   5.8%
-7                  36590   5.1%
0                   35060   4.9%
-8                  27533   3.8%
-9                  20193   2.8%
Other values (984)  292331  40.5%

Value  Count  Frequency (%)
-57    1      < 0.1%
-51    1      < 0.1%
-49    1      < 0.1%
-47    1      < 0.1%
-45    2      < 0.1%
-43    1      < 0.1%
-42    1      < 0.1%
-41    1      < 0.1%
-40    4      < 0.1%
-39    1      < 0.1%

Value  Count  Frequency (%)
1861   1      < 0.1%
1576   1      < 0.1%
1559   1      < 0.1%
1531   1      < 0.1%
1528   1      < 0.1%
1522   1      < 0.1%
1518   1      < 0.1%
1486   1      < 0.1%
1460   1      < 0.1%
1417   1      < 0.1%

taxi_out
Real number (ℝ≥0)

MISSING

Distinct: 169
Distinct (%): < 0.1%
Missing: 11624
Missing (%): 1.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.3887829
Minimum: 1
Maximum: 180
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 11
median: 15
Q3: 20
95-th percentile: 35
Maximum: 180
Range: 179
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 9.878590816
Coefficient of variation (CV): 0.5681013371
Kurtosis: 20.40220955
Mean: 17.3887829
Median Absolute Deviation (MAD): 4
Skewness: 3.236346317
Sum: 12341167
Variance: 97.58655651
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
12                  53729   7.4%
13                  52366   7.3%
11                  52173   7.2%
14                  48393   6.7%
10                  46901   6.5%
15                  44060   6.1%
16                  39070   5.4%
9                   36981   5.1%
17                  34497   4.8%
18                  30384   4.2%
Other values (159)  271166  37.6%

Value  Count  Frequency (%)
1      10     < 0.1%
2      20     < 0.1%
3      135    < 0.1%
4      485    0.1%
5      1852   0.3%
6      6368   0.9%
7      14327  2.0%
8      25268  3.5%
9      36981  5.1%
10     46901  6.5%

Value  Count  Frequency (%)
180    1      < 0.1%
177    1      < 0.1%
176    1      < 0.1%
175    1      < 0.1%
173    1      < 0.1%
166    1      < 0.1%
165    3      < 0.1%
163    2      < 0.1%
162    2      < 0.1%
161    2      < 0.1%

wheels_off
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 1431
Distinct (%): 0.2%
Missing: 11624
Missing (%): 1.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1359.002085
Minimum: 1
Maximum: 2400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 616
Q1: 932
median: 1341
Q3: 1759
95-th percentile: 2155
Maximum: 2400
Range: 2399
Interquartile range (IQR): 827

Descriptive statistics

Standard deviation: 505.8791287
Coefficient of variation (CV): 0.3722430849
Kurtosis: -0.9238495754
Mean: 1359.002085
Median Absolute Deviation (MAD): 414
Skewness: -0.005912553281
Sum: 964510960
Variance: 255913.6928
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
610                  1203    0.2%
609                  1178    0.2%
611                  1172    0.2%
608                  1155    0.2%
613                  1153    0.2%
612                  1144    0.2%
607                  1103    0.2%
614                  1101    0.2%
710                  1030    0.1%
709                  1022    0.1%
Other values (1421)  698459  96.8%
(Missing)            11624   1.6%

Value  Count  Frequency (%)
1      129    < 0.1%
2      107    < 0.1%
3      99     < 0.1%
4      107    < 0.1%
5      114    < 0.1%
6      125    < 0.1%
7      103    < 0.1%
8      97     < 0.1%
9      90     < 0.1%
10     121    < 0.1%

Value  Count  Frequency (%)
2400   85     < 0.1%
2359   132    < 0.1%
2358   103    < 0.1%
2357   107    < 0.1%
2356   114    < 0.1%
2355   110    < 0.1%
2354   101    < 0.1%
2353   115    < 0.1%
2352   127    < 0.1%
2351   102    < 0.1%

wheels_on
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 1440
Distinct (%): 0.2%
Missing: 11950
Missing (%): 1.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 1463.093707
Minimum: 1
Maximum: 2400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 643
Q1: 1045
median: 1503
Q3: 1912
95-th percentile: 2249
Maximum: 2400
Range: 2399
Interquartile range (IQR): 867

Descriptive statistics

Standard deviation: 533.6179693
Coefficient of variation (CV): 0.3647189287
Kurtosis: -0.4533442863
Mean: 1463.093707
Median Absolute Deviation (MAD): 423
Skewness: -0.3279631239
Sum: 1037909897
Variance: 284748.1372
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
2104                 815     0.1%
1844                 799     0.1%
1650                 797     0.1%
1854                 794     0.1%
1850                 791     0.1%
2106                 791     0.1%
1904                 787     0.1%
1644                 786     0.1%
1710                 785     0.1%
1615                 785     0.1%
Other values (1430)  701464  97.2%
(Missing)            11950   1.7%

Value  Count  Frequency (%)
1      398    0.1%
2      329    < 0.1%
3      323    < 0.1%
4      299    < 0.1%
5      294    < 0.1%
6      318    < 0.1%
7      305    < 0.1%
8      299    < 0.1%
9      281    < 0.1%
10     261    < 0.1%

Value  Count  Frequency (%)
2400   249    < 0.1%
2359   355    < 0.1%
2358   349    < 0.1%
2357   390    0.1%
2356   368    0.1%
2355   379    0.1%
2354   349    < 0.1%
2353   366    0.1%
2352   390    0.1%
2351   400    0.1%

taxi_in
Real number (ℝ≥0)

MISSING

Distinct: 150
Distinct (%): < 0.1%
Missing: 11950
Missing (%): 1.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.601802383
Minimum: 1
Maximum: 258
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 4
median: 6
Q3: 9
95-th percentile: 18
Maximum: 258
Range: 257
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 6.071801211
Coefficient of variation (CV): 0.7987317882
Kurtosis: 51.51948995
Mean: 7.601802383
Median Absolute Deviation (MAD): 2
Skewness: 4.620734385
Sum: 5392673
Variance: 36.86676995
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
4                   107022  14.8%
5                   100850  14.0%
6                   82150   11.4%
3                   80221   11.1%
7                   65603   9.1%
8                   48874   6.8%
9                   37974   5.3%
10                  29575   4.1%
2                   26371   3.7%
11                  23274   3.2%
Other values (140)  107480  14.9%

Value  Count   Frequency (%)
1      1771    0.2%
2      26371   3.7%
3      80221   11.1%
4      107022  14.8%
5      100850  14.0%
6      82150   11.4%
7      65603   9.1%
8      48874   6.8%
9      37974   5.3%
10     29575   4.1%

Value  Count  Frequency (%)
258    2      < 0.1%
182    1      < 0.1%
180    1      < 0.1%
177    1      < 0.1%
176    1      < 0.1%
173    1      < 0.1%
161    1      < 0.1%
159    1      < 0.1%
158    1      < 0.1%
153    1      < 0.1%

crs_arr_time
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1405
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1487.061976
Minimum: 1
Maximum: 2400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 714
Q1: 1101
median: 1516
Q3: 1919
95-th percentile: 2255
Maximum: 2400
Range: 2399
Interquartile range (IQR): 818

Descriptive statistics

Standard deviation: 518.5120298
Coefficient of variation (CV): 0.3486821923
Kurtosis: -0.4636437834
Mean: 1487.061976
Median Absolute Deviation (MAD): 409
Skewness: -0.3024625396
Sum: 1072683234
Variance: 268854.725
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
2100                 2115    0.3%
1700                 2044    0.3%
1900                 2002    0.3%
1130                 1996    0.3%
1855                 1981    0.3%
2115                 1952    0.3%
2125                 1909    0.3%
2120                 1870    0.3%
1925                 1852    0.3%
900                  1839    0.3%
Other values (1395)  701784  97.3%

Value  Count  Frequency (%)
1      247    < 0.1%
2      187    < 0.1%
3      220    < 0.1%
4      213    < 0.1%
5      746    0.1%
6      168    < 0.1%
7      178    < 0.1%
8      154    < 0.1%
9      247    < 0.1%
10     643    0.1%

Value  Count  Frequency (%)
2400   10     < 0.1%
2359   1224   0.2%
2358   547    0.1%
2357   587    0.1%
2356   581    0.1%
2355   1394   0.2%
2354   529    0.1%
2353   403    0.1%
2352   333    < 0.1%
2351   383    0.1%

arr_delay
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 1015
Distinct (%): 0.1%
Missing: 13704
Missing (%): 1.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.045278673
Minimum: -102
Maximum: 1861
Zeros: 13971
Zeros (%): 1.9%
Negative: 442362
Negative (%): 61.3%
Memory size: 5.5 MiB

Quantile statistics

Minimum: -102
5-th percentile: -26
Q1: -14
median: -6
Q3: 8
95-th percentile: 73
Maximum: 1861
Range: 1963
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 46.9310833
Coefficient of variation (CV): 9.301980394
Kurtosis: 138.8686884
Mean: 5.045278673
Median Absolute Deviation (MAD): 10
Skewness: 8.388702025
Sum: 3570241
Variance: 2202.52658
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
-11                  21072   2.9%
-10                  20992   2.9%
-9                   20958   2.9%
-8                   20805   2.9%
-12                  20680   2.9%
-7                   20293   2.8%
-13                  20221   2.8%
-6                   19527   2.7%
-14                  19199   2.7%
-5                   18677   2.6%
Other values (1005)  505216  70.0%

Value  Count  Frequency (%)
-102   1      < 0.1%
-91    1      < 0.1%
-83    1      < 0.1%
-81    1      < 0.1%
-77    1      < 0.1%
-74    4      < 0.1%
-73    2      < 0.1%
-72    2      < 0.1%
-71    4      < 0.1%
-70    3      < 0.1%

Value  Count  Frequency (%)
1861   1      < 0.1%
1576   1      < 0.1%
1558   1      < 0.1%
1543   1      < 0.1%
1523   1      < 0.1%
1515   1      < 0.1%
1506   1      < 0.1%
1477   1      < 0.1%
1449   1      < 0.1%
1430   1      < 0.1%

cancelled
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 MiB

0.0: 709644
1.0: 11700

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2164032
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count   Frequency (%)
0.0    709644  98.4%
1.0    11700   1.6%

Length

Histogram of lengths of the category

Pie chart

Value  Count   Frequency (%)
0.0    709644  98.4%
1.0    11700   1.6%

Most occurring characters

Value  Count    Frequency (%)
0      1430988  66.1%
.      721344   33.3%
1      11700    0.5%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     1442688  66.7%
Other Punctuation  721344   33.3%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
0      1430988  99.2%
1      11700    0.8%

Other Punctuation
Value  Count   Frequency (%)
.      721344  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  2164032  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
0      1430988  66.1%
.      721344   33.3%
1      11700    0.5%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2164032  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
0      1430988  66.1%
.      721344   33.3%
1      11700    0.5%

diverted
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 MiB

0.0: 719597
1.0: 1747

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2164032
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 1.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count   Frequency (%)
0.0    719597  99.8%
1.0    1747    0.2%

Length

Histogram of lengths of the category

Pie chart

Value  Count   Frequency (%)
0.0    719597  99.8%
1.0    1747    0.2%

Most occurring characters

Value  Count    Frequency (%)
0      1440941  66.6%
.      721344   33.3%
1      1747     0.1%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     1442688  66.7%
Other Punctuation  721344   33.3%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
0      1440941  99.9%
1      1747     0.1%

Other Punctuation
Value  Count   Frequency (%)
.      721344  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  2164032  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
0      1440941  66.6%
.      721344   33.3%
1      1747     0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2164032  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
0      1440941  66.6%
.      721344   33.3%
1      1747     0.1%

crs_elapsed_time
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 546
Distinct (%): 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 141.1707246
Minimum: 21
Maximum: 703
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 21
5-th percentile: 60
Q1: 88
median: 122
Q3: 171
95-th percentile: 304
Maximum: 703
Range: 682
Interquartile range (IQR): 83

Descriptive statistics

Standard deviation: 73.39593725
Coefficient of variation (CV): 0.5199090495
Kurtosis: 2.308242928
Mean: 141.1707246
Median Absolute Deviation (MAD): 39
Skewness: 1.431518039
Sum: 101832514
Variance: 5386.963604
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
90                  13152   1.8%
85                  13078   1.8%
80                  12447   1.7%
70                  11571   1.6%
75                  10669   1.5%
65                  9956    1.4%
95                  9716    1.3%
110                 9325    1.3%
100                 9165    1.3%
105                 8807    1.2%
Other values (536)  613457  85.0%

Value  Count  Frequency (%)
21     22     < 0.1%
22     14     < 0.1%
23     19     < 0.1%
24     6      < 0.1%
25     1      < 0.1%
26     4      < 0.1%
27     5      < 0.1%
30     9      < 0.1%
31     30     < 0.1%
32     4      < 0.1%

Value  Count  Frequency (%)
703    1      < 0.1%
695    6      < 0.1%
690    5      < 0.1%
683    4      < 0.1%
681    8      < 0.1%
679    3      < 0.1%
675    1      < 0.1%
672    5      < 0.1%
670    3      < 0.1%
658    5      < 0.1%

actual_elapsed_time
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 637
Distinct (%): 0.1%
Missing: 13447
Missing (%): 1.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 136.4907592
Minimum: 16
Maximum: 739
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 16
5-th percentile: 56
Q1: 83
median: 118
Q3: 167
95-th percentile: 297
Maximum: 739
Range: 723
Interquartile range (IQR): 84

Descriptive statistics

Standard deviation: 73.15503209
Coefficient of variation (CV): 0.5359705851
Kurtosis: 2.288041661
Mean: 136.4907592
Median Absolute Deviation (MAD): 39
Skewness: 1.417156948
Sum: 96621399
Variance: 5351.658721
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
79                  6088    0.8%
77                  5955    0.8%
80                  5953    0.8%
82                  5908    0.8%
81                  5890    0.8%
85                  5881    0.8%
76                  5859    0.8%
78                  5822    0.8%
75                  5783    0.8%
83                  5706    0.8%
Other values (627)  649052  90.0%
(Missing)           13447   1.9%

Value  Count  Frequency (%)
16     1      < 0.1%
17     2      < 0.1%
18     5      < 0.1%
19     5      < 0.1%
20     4      < 0.1%
21     4      < 0.1%
22     4      < 0.1%
23     10     < 0.1%
24     8      < 0.1%
25     4      < 0.1%

Value  Count  Frequency (%)
739    1      < 0.1%
716    1      < 0.1%
715    1      < 0.1%
714    1      < 0.1%
701    1      < 0.1%
687    3      < 0.1%
686    2      < 0.1%
679    1      < 0.1%
676    1      < 0.1%
674    3      < 0.1%

air_time
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 614
Distinct (%): 0.1%
Missing: 13447
Missing (%): 1.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 111.5125873
Minimum: 7
Maximum: 688
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 7
5-th percentile: 34
Q1: 60
median: 92
Q3: 141
95-th percentile: 270
Maximum: 688
Range: 681
Interquartile range (IQR): 81

Descriptive statistics

Standard deviation: 71.13892974
Coefficient of variation (CV): 0.6379452892
Kurtosis: 2.323422908
Mean: 111.5125873
Median Absolute Deviation (MAD): 38
Skewness: 1.441227262
Sum: 78939426
Variance: 5060.747325
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
62                  6608    0.9%
63                  6565    0.9%
61                  6530    0.9%
60                  6481    0.9%
58                  6433    0.9%
64                  6379    0.9%
65                  6360    0.9%
59                  6356    0.9%
45                  6231    0.9%
57                  6205    0.9%
Other values (604)  643749  89.2%
(Missing)           13447   1.9%

Value  Count  Frequency (%)
7      1      < 0.1%
8      6      < 0.1%
9      14     < 0.1%
10     11     < 0.1%
11     10     < 0.1%
12     8      < 0.1%
13     14     < 0.1%
14     40     < 0.1%
15     105    < 0.1%
16     164    < 0.1%

Value  Count  Frequency (%)
688    1      < 0.1%
687    1      < 0.1%
678    1      < 0.1%
675    1      < 0.1%
666    1      < 0.1%
665    1      < 0.1%
659    1      < 0.1%
657    2      < 0.1%
654    1      < 0.1%
650    1      < 0.1%

distance
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1536
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 800.3805147
Minimum: 31
Maximum: 4983
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 31
5-th percentile: 164
Q1: 363
median: 632
Q3: 1034
95-th percentile: 2176
Maximum: 4983
Range: 4952
Interquartile range (IQR): 671

Descriptive statistics

Standard deviation: 598.6037086
Coefficient of variation (CV): 0.7478989026
Kurtosis: 2.44681817
Mean: 800.3805147
Median Absolute Deviation (MAD): 315
Skewness: 1.477416356
Sum: 577349682
Variance: 358326.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
337                  5057    0.7%
733                  3597    0.5%
296                  3486    0.5%
594                  3032    0.4%
214                  2968    0.4%
399                  2946    0.4%
447                  2899    0.4%
404                  2874    0.4%
867                  2791    0.4%
588                  2717    0.4%
Other values (1526)  688977  95.5%

Value  Count  Frequency (%)
31     64     < 0.1%
41     15     < 0.1%
55     8      < 0.1%
66     178    < 0.1%
67     416    0.1%
68     135    < 0.1%
69     157    < 0.1%
70     15     < 0.1%
73     522    0.1%
74     464    0.1%

Value  Count  Frequency (%)
4983   75     < 0.1%
4962   79     < 0.1%
4817   34     < 0.1%
4502   77     < 0.1%
4243   77     < 0.1%
4184   39     < 0.1%
3972   45     < 0.1%
3904   66     < 0.1%
3847   4      < 0.1%
3801   82     < 0.1%

delayed
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 MiB

0: 470037
1: 251307

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 721344
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      470037  65.2%
1      251307  34.8%

Length

Histogram of lengths of the category

Pie chart

Value  Count   Frequency (%)
0      470037  65.2%
1      251307  34.8%

Most occurring characters

Value  Count   Frequency (%)
0      470037  65.2%
1      251307  34.8%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  721344  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      470037  65.2%
1      251307  34.8%

Most occurring scripts

Value   Count   Frequency (%)
Common  721344  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      470037  65.2%
1      251307  34.8%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  721344  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      470037  65.2%
1      251307  34.8%

day
Categorical

Distinct       7
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    5.5 MiB

Value              Count
Monday             108362
Friday             107919
Thursday           107062
Wednesday          104693
Tuesday            103191
Other values (2)   190117

Length

Max length      9
Median length   7
Mean length     7.119804143
Min length      6

Characters and Unicode

Total characters      5135828
Distinct characters   17
Distinct categories   2
Distinct scripts      1
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       0
Unique (%)   0.0%

Sample

1st row   Sunday
2nd row   Sunday
3rd row   Friday
4th row   Sunday
5th row   Tuesday

Common Values

Value       Count    Frequency (%)
Monday      108362   15.0%
Friday      107919   15.0%
Thursday    107062   14.8%
Wednesday   104693   14.5%
Tuesday     103191   14.3%
Sunday      101932   14.1%
Saturday    88185    12.2%

Length

Histogram of lengths of the category

Pie chart

Value       Count    Frequency (%)
monday      108362   15.0%
friday      107919   15.0%
thursday    107062   14.8%
wednesday   104693   14.5%
tuesday     103191   14.3%
sunday      101932   14.1%
saturday    88185    12.2%

Most occurring characters

Value              Count    Frequency (%)
d                  826037   16.1%
a                  809529   15.8%
y                  721344   14.0%
u                  400370   7.8%
n                  314987   6.1%
s                  314946   6.1%
e                  312577   6.1%
r                  303166   5.9%
T                  210253   4.1%
S                  190117   3.7%
Other values (7)   732502   14.3%

Most occurring categories

Value              Count     Frequency (%)
Lowercase Letter   4414484   86.0%
Uppercase Letter   721344    14.0%

Most frequent character per category

Lowercase Letter
Value              Count    Frequency (%)
d                  826037   18.7%
a                  809529   18.3%
y                  721344   16.3%
u                  400370   9.1%
n                  314987   7.1%
s                  314946   7.1%
e                  312577   7.1%
r                  303166   6.9%
o                  108362   2.5%
i                  107919   2.4%
Other values (2)   195247   4.4%

Uppercase Letter
Value   Count    Frequency (%)
T       210253   29.1%
S       190117   26.4%
M       108362   15.0%
F       107919   15.0%
W       104693   14.5%

Most occurring scripts

Value   Count     Frequency (%)
Latin   5135828   100.0%

Most frequent character per script

Latin
Value              Count    Frequency (%)
d                  826037   16.1%
a                  809529   15.8%
y                  721344   14.0%
u                  400370   7.8%
n                  314987   6.1%
s                  314946   6.1%
e                  312577   6.1%
r                  303166   5.9%
T                  210253   4.1%
S                  190117   3.7%
Other values (7)   732502   14.3%

Most occurring blocks

Value   Count     Frequency (%)
ASCII   5135828   100.0%

Most frequent character per block

ASCII
Value              Count    Frequency (%)
d                  826037   16.1%
a                  809529   15.8%
y                  721344   14.0%
u                  400370   7.8%
n                  314987   6.1%
s                  314946   6.1%
e                  312577   6.1%
r                  303166   5.9%
T                  210253   4.1%
S                  190117   3.7%
Other values (7)   732502   14.3%

week
Real number (ℝ≥0)

HIGH CORRELATION

Distinct       53
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Infinite       0
Infinite (%)   0.0%
Mean           26.7341823
Minimum        1
Maximum        53
Zeros          0
Zeros (%)      0.0%
Negative       0
Negative (%)   0.0%
Memory size    5.5 MiB

Quantile statistics

Minimum                     1
5-th percentile             3
Q1                          14
median                      27
Q3                          39
95-th percentile            50
Maximum                     53
Range                       52
Interquartile range (IQR)   25

Descriptive statistics

Standard deviation                14.8156822
Coefficient of variation (CV)     0.5541849767
Kurtosis                          -1.163709087
Mean                              26.7341823
Median Absolute Deviation (MAD)   13
Skewness                          -0.00966330638
Sum                               19284542
Variance                          219.504439
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count    Frequency (%)
25                  14969    2.1%
28                  14916    2.1%
31                  14875    2.1%
26                  14829    2.1%
29                  14810    2.1%
24                  14787    2.0%
30                  14736    2.0%
32                  14716    2.0%
33                  14585    2.0%
23                  14357    2.0%
Other values (43)   573764   79.5%

Value   Count   Frequency (%)
1       13101   1.8%
2       12790   1.8%
3       12547   1.7%
4       12693   1.8%
5       12557   1.7%
6       12997   1.8%
7       12960   1.8%
8       13452   1.9%
9       13350   1.9%
10      13877   1.9%

Value   Count   Frequency (%)
53      1678    0.2%
52      13213   1.8%
51      14120   2.0%
50      13096   1.8%
49      13304   1.8%
48      13816   1.9%
47      12990   1.8%
46      13965   1.9%
45      13711   1.9%
44      13325   1.8%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the slope of the line, that is, its angle to the x-axis, does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
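
As an illustration of that definition, here is a minimal pure-Python sketch. The function name `pearson_r` is hypothetical, and the input is the `dep_delay` and `arr_delay` values of the nine complete rows among the first sample rows shown later in this report, so the result is only illustrative and is not the coefficient the report computes on the full dataset.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and of the two standard deviations
    # cancel, so sums of deviations are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Complete rows among the first ten sample rows (row 2 has no arr_delay).
dep_delay = [-3.0, -2.0, -7.0, 2.0, -8.0, 1.0, 2.0, -10.0, 10.0]
arr_delay = [-11.0, -18.0, -9.0, 0.0, 7.0, -7.0, -24.0, -25.0, -5.0]

r = pearson_r(dep_delay, arr_delay)
# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([60 * d + 15 for d in dep_delay], arr_delay)
print(round(r, 3), round(r_rescaled, 3))
```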

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
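
The rank-based calculation can be sketched as follows. This is a minimal pure-Python version under the assumption that tied values receive the average of their ranks; the helper names (`average_ranks`, `pearson_r`, `spearman_rho`) are hypothetical and not taken from the report's tooling.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]      # monotonic but not linear in x
print(spearman_rho(x, y), pearson_r(x, y))
```

On this toy input ρ reaches 1 because the relationship is perfectly monotonic, while r stays below 1 because it is not linear.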

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
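
A direct, quadratic-time sketch of that pair-counting definition is shown below. It implements the variant usually called tau-a, in which tied pairs count as neither concordant nor discordant; statistical libraries often report a tie-adjusted variant instead, so results can differ when ties are present. The function name is hypothetical.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Six pairs in total; only the pair formed by the 2nd and 3rd points is discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```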

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
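
For orientation, the sketch below shows one common way to write the bias-corrected estimator, as far as it is usually stated: the chi-squared statistic divided by n gives φ², from which (k-1)(r-1)/(n-1) is subtracted (floored at zero), and the row and column counts are shrunk by (r-1)²/(n-1) and (k-1)²/(n-1). This is an assumption about the formula, not a copy of the report's implementation, and the function name is hypothetical.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    if n < 2:
        return 0.0
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0               # a constant variable carries no association
    return math.sqrt(phi2_corr / denom)

labels = ["a"] * 50 + ["b"] * 50
v_perfect = cramers_v_corrected(labels, [0] * 50 + [1] * 50)
v_independent = cramers_v_corrected(labels, [0, 1] * 50)
print(v_perfect, v_independent)
```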

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
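
One way to read "nullity correlation" is as the Pearson correlation between per-column missing-value indicators. The sketch below follows that reading, which is an assumption about how the heatmap is built. It uses the `dep_delay` and `arr_delay` values of the twenty sample rows listed in the Sample section, with `None` standing in for NaN, so the number it prints describes only those twenty rows.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# First ten and last ten sample rows; None marks a missing value.
dep_delay = [-3.0, -2.0, -4.0, -7.0, 2.0, -8.0, 1.0, 2.0, -10.0, 10.0,
             -9.0, -2.0, 152.0, -5.0, 23.0, None, -5.0, -7.0, 31.0, -4.0]
arr_delay = [-11.0, -18.0, None, -9.0, 0.0, 7.0, -7.0, -24.0, -25.0, -5.0,
             -11.0, -20.0, 164.0, -20.0, 48.0, None, -17.0, -12.0, 38.0, -23.0]

# 1 where the value is missing, 0 where it is present.
dep_missing = [int(v is None) for v in dep_delay]
arr_missing = [int(v is None) for v in arr_delay]

nullity_corr = pearson_r(dep_missing, arr_missing)
print(round(nullity_corr, 3))
```

A value near +1 means the two columns tend to be missing together, as happens here for the cancelled flight, while the diverted flight lacks only `arr_delay`.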

Sample

First rows

   df_index  fl_date     op_carrier  origin  dest  crs_dep_time  dep_delay  taxi_out  wheels_off  wheels_on  taxi_in  crs_arr_time  arr_delay  cancelled  diverted  crs_elapsed_time  actual_elapsed_time  air_time  distance  delayed  day        week
0  2832825   2018-05-27  AS          SFO     LAX   2105          -3.0       16.0      2118.0      2211.0     10.0     2232          -11.0      0.0        0.0       87.0              79.0                 53.0      337.0     0        Sunday     21
1  241995    2018-01-14  EV          ORD     SGF   1555          -2.0       15.0      1608.0      1718.0     8.0      1744          -18.0      0.0        0.0       109.0             93.0                 70.0      438.0     0        Sunday     02
2  339699    2018-01-19  OO          DEN     SUN   1130          -4.0       14.0      1140.0      NaN        NaN      1334          NaN        0.0        1.0       124.0             NaN                  NaN       557.0     0        Friday     03
3  5542533   2018-10-07  HA          LIH     HNL   1042          -7.0       8.0       1043.0      1103.0     7.0      1119          -9.0       0.0        0.0       37.0              35.0                 20.0      102.0     0        Sunday     40
4  27344     2018-01-02  WN          MCO     ALB   1335          2.0        17.0      1354.0      1613.0     2.0      1615          0.0        0.0        0.0       160.0             158.0                139.0     1073.0    0        Tuesday    01
5  1210004   2018-03-07  OO          MLI     MSP   645           -8.0       33.0      710.0       809.0      9.0      811           7.0        0.0        0.0       86.0              101.0                59.0      274.0     1        Wednesday  10
6  4985895   2018-09-09  DL          JAC     SLC   700           1.0        14.0      715.0       753.0      5.0      805           -7.0       0.0        0.0       65.0              57.0                 38.0      205.0     0        Sunday     36
7  3864175   2018-07-16  DL          STT     ATL   1442          2.0        11.0      1455.0      1809.0     10.0     1843          -24.0      0.0        0.0       241.0             215.0                194.0     1599.0    0        Monday     29
8  7439      2018-01-01  OO          TUS     LAX   910           -10.0      17.0      917.0       932.0      7.0      1004          -25.0      0.0        0.0       114.0             99.0                 75.0      451.0     0        Monday     01
9  799119    2018-02-13  WN          PHX     OAK   1300          10.0       9.0       1319.0      1356.0     4.0      1405          -5.0       0.0        0.0       125.0             110.0                97.0      646.0     0        Tuesday    07

Last rows

        df_index  fl_date     op_carrier  origin  dest  crs_dep_time  dep_delay  taxi_out  wheels_off  wheels_on  taxi_in  crs_arr_time  arr_delay  cancelled  diverted  crs_elapsed_time  actual_elapsed_time  air_time  distance  delayed  day        week
721334  6065437   2018-11-02  OO          SBP     LAX   1657          -9.0       20.0      1708.0      1745.0     14.0     1810          -11.0      0.0        0.0       73.0              71.0                 37.0      156.0     0        Friday     44
721335  1353960   2018-03-14  WN          MDW     OAK   2015          -2.0       10.0      2023.0      2229.0     6.0      2255          -20.0      0.0        0.0       280.0             262.0                246.0     1844.0    0        Wednesday  11
721336  3709250   2018-07-09  WN          PHX     CMH   1920          152.0      6.0       2158.0      433.0      6.0      155           164.0      0.0        0.0       215.0             227.0                215.0     1670.0    1        Monday     28
721337  2848071   2018-05-28  AA          BOS     LGA   1600          -5.0       16.0      1611.0      1653.0     11.0     1724          -20.0      0.0        0.0       84.0              69.0                 42.0      184.0     0        Monday     22
721338  1465806   2018-03-20  B6          FLL     SYR   2005          23.0       53.0      2121.0      2353.0     2.0      2307          48.0       0.0        0.0       182.0             207.0                152.0     1197.0    1        Tuesday    12
721339  4237444   2018-08-03  AA          CLT     MCO   2210          NaN        NaN       NaN         NaN        NaN      2345          NaN        1.0        0.0       95.0              NaN                  NaN       468.0     0        Friday     31
721340  660476    2018-02-06  EV          DFW     LBB   845           -5.0       9.0       849.0       937.0      4.0      958           -17.0      0.0        0.0       73.0              61.0                 48.0      282.0     0        Tuesday    06
721341  705694    2018-02-08  WN          SMF     LAX   1405          -7.0       14.0      1412.0      1511.0     7.0      1530          -12.0      0.0        0.0       85.0              80.0                 59.0      373.0     0        Thursday   06
721342  1950105   2018-04-13  WN          EWR     OAK   1710          31.0       30.0      1811.0      2057.0     6.0      2025          38.0       0.0        0.0       375.0             382.0                346.0     2555.0    1        Friday     15
721343  6199338   2018-11-09  DL          PHX     MSP   915           -4.0       12.0      923.0       1305.0     4.0      1332          -23.0      0.0        0.0       197.0             178.0                162.0     1276.0    0        Friday     45
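
The report does not say how the `delayed`, `day` and `week` columns were produced. The sample rows are, however, consistent with a simple derivation from `fl_date` and `arr_delay`, sketched below as an inference rather than the dataset's documented method: `day` as the English weekday name, `week` as the zero-padded Monday-based week of the year (`%W`, which also gives the maximum of 53 for 2018-12-31), and `delayed` as 1 when `arr_delay` is positive. Any cutoff above 0 and up to 7 minutes would fit the sample equally well, and the function name is hypothetical.

```python
from datetime import date

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")

def derive_columns(fl_date, arr_delay):
    """Return (delayed, day, week) for one flight; arr_delay may be None."""
    d = date.fromisoformat(fl_date)
    day = DAY_NAMES[d.weekday()]          # weekday() is 0 for Monday
    week = d.strftime("%W")               # Monday-based week of year, zero-padded
    # Assumed rule: a missing or non-positive arrival delay is "not delayed".
    delayed = int(arr_delay is not None and arr_delay > 0)
    return delayed, day, week

# (fl_date, arr_delay) pairs taken from the sample rows above.
samples = [
    ("2018-05-27", -11.0),
    ("2018-01-02", 0.0),
    ("2018-03-07", 7.0),
    ("2018-08-03", None),
]
derived = [derive_columns(fl_date, arr_delay) for fl_date, arr_delay in samples]
for row in derived:
    print(row)
```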